Provably Efficient Scheduling for Languages with Fine-Grained Parallelism
Authors
Abstract
Many high-level parallel programming languages allow for fine-grained parallelism. As in the popular work-time framework for parallel algorithm design, programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. A common concern in executing such programs is to schedule tasks to processors dynamically so as to minimize not only the execution time, but also the amount of space (memory) needed. Without careful scheduling, the parallel execution on p processors can use a factor of p (or larger) more space than a sequential implementation of the same program.

This paper first identifies a class of parallel schedules that are provably efficient in both time and space. For any computation with w units of work and critical path length d, and for any sequential schedule that takes space s, we provide a parallel schedule that takes fewer than w/p + d steps on p processors and requires less than s + p·d space. This matches the lower bound that we show, and significantly improves upon the best previous bound of s·p space for the common case where d ≪ s.

The paper then describes a scheduler for implementing high-level languages with nested parallelism that generates schedules in this class. During program execution, as the structure of the computation is revealed, the scheduler keeps track of the active tasks, allocates the tasks to the processors, and performs the necessary task synchronization. The scheduler is itself a parallel algorithm, and incurs at most a constant-factor overhead in time and space, even when the scheduling granularity is individual units of work. The algorithm is the first efficient solution to the scheduling problem discussed here, even if space considerations are ignored.

Our results apply to a variety of memory allocation schemes in programming languages: stack allocation, explicit heap management, and implicit heap management. Space allocation is modeled as various games on weighted and group-weighted directed acyclic graphs (dags). Our space bounds are obtained by proving properties relating parallel schedules and parallel pebble games on arbitrary dags to their sequential counterparts. The scheduler algorithm relies on properties we prove for planar and series-parallel dags.

A preliminary version of this work appears in the Proceedings of the Annual ACM Symposium on Parallel Algorithms and Architectures, Santa Barbara, Calif., ACM, New York, July.
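The w/p + d time bound above is the classic greedy-scheduling bound: any schedule that never idles a processor while a task is ready uses fewer than w/p + d steps. As a hedged illustration only (the fork-join dag and all task names are invented here, and this level-synchronous greedy loop is a simplification, not the paper's actual scheduler), one can check the bound on a small unit-cost task dag:

```python
from collections import deque

def greedy_schedule(deps, p):
    """Greedy level-by-level schedule of a unit-cost task dag on p processors.

    deps maps each task to the set of its predecessors; every task costs one
    unit of work. Returns the number of parallel steps taken. At each step we
    run up to p ready tasks, so no processor idles while work is available.
    """
    indeg = {t: len(ps) for t, ps in deps.items()}
    succs = {t: [] for t in deps}
    for t, ps in deps.items():
        for u in ps:
            succs[u].append(t)
    ready = deque(t for t, n in indeg.items() if n == 0)
    steps = done = 0
    while ready:
        batch = [ready.popleft() for _ in range(min(p, len(ready)))]
        steps += 1
        done += len(batch)
        for t in batch:                 # retire tasks, release successors
            for u in succs[t]:
                indeg[u] -= 1
                if indeg[u] == 0:
                    ready.append(u)
    assert done == len(deps)            # every task was scheduled
    return steps

def critical_path(deps):
    """Length d of the longest path in the dag, counted in tasks."""
    memo = {}
    def depth(t):
        if t not in memo:
            memo[t] = 1 + max((depth(u) for u in deps[t]), default=0)
        return memo[t]
    return max(depth(t) for t in deps)

# Hypothetical fork-join example: root spawns a, b, c; join waits for all.
deps = {"root": set(), "a": {"root"}, "b": {"root"},
        "c": {"root"}, "join": {"a", "b", "c"}}
w, d, p = len(deps), critical_path(deps), 2
assert greedy_schedule(deps, p) < w / p + d   # fewer than w/p + d steps
```

Each step of the greedy loop either runs p tasks (there are at most w/p such steps) or runs every ready task, shortening the remaining critical path (at most d such steps), which is where the bound comes from. The paper's contribution is a schedule in this class that also bounds space by s + p·d.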
Similar resources
Provably Efficient Scheduling for Languages with Fine-Grained Parallelism
Many high-level parallel programming languages allow for fine-grained parallelism. As in the popular work-time framework for parallel algorithm design, programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. A common concern in executing such programs is to dynamically schedule tasks to processors so as to n...
Efficient Scheduling of Nested Parallelism (GIRIJA)
Many of today's high-level parallel languages support dynamic, fine-grained parallelism. These languages allow the user to expose all the parallelism in the program, which is typically of a much higher degree than the number of processors. Hence an efficient scheduling algorithm is required to assign computations to processors at runtime. Besides having low overheads and good load balancing, it is ...
A Framework for Space and Time Efficient Scheduling of Parallelism
Many of today’s high level parallel languages support dynamic, fine-grained parallelism. These languages allow the user to expose all the parallelism in the program, which is typically of a much higher degree than the number of processors. Hence an efficient scheduling algorithm is required to assign computations to processors at runtime. Besides having low overheads and good load balancing, it...
Balancing Fine- and Medium-Grained Parallelism in Scheduling Loops for the XIMD Architecture
This paper presents an approach to scheduling loops that leverages the distinctive architectural features of the XIMD, particularly the variable number of instruction streams and low synchronization cost. The classical VLIW and MIMD architectures have a fixed number of instruction streams, each with a fixed width. A compiler for the XIMD architecture can exploit fine-grained parallelism within ...
Semantics-based parallel cost models and their use in provably efficient implementations
Understanding the performance issues of modern programming language execution can be difficult. These languages have abstract features, such as higher-order functions, laziness, and objects, that ease programming, but which make their mapping to the underlying machine more difficult. Understanding parallel languages is further complicated by the need to describe what computations are performed in p...
Journal title:
Volume  Issue
Pages  -
Publication date: 1995